How to Make Great Data Visualisations

Aurélien Goutsmedt

ICHEC Brussels Management School

UC Louvain, ISPOLE

Thomas Laloux

UC Louvain, ISPOLE

May 12, 2025

What makes a great data visualisation?

One thing’s for sure: it’s never a pie chart…

General Considerations

Healy (2018) Data Visualization: A Practical Introduction. Princeton University Press.

  • General discussion of common issues regarding data visualizations
  • Applications to social sciences
  • Focus on R package ggplot2 (Wickham, 2016)

Why visualize data?

  • Exploring your data (identifying patterns and problems, etc…)
    • Extract information intuitively, efficiently, and accurately

  • Revealing relationships in your data
  • Communicating information with precision, in a concise way
  • Highlighting your main findings
  • A more sociological dimension: bolstering credibility
  • But “looking at data” is not sufficient for sound analysis, and it comes with some risks

Edward Tufte’s main principle

Graphical excellence is the well-designed presentation of interesting data—a matter of substance, of statistics, and of design. … [It] consists of complex ideas communicated with clarity, precision, and efficiency. … [It] is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space. … [It] is nearly always multivariate. … And graphical excellence requires telling the truth about the data. (Tufte, 1983, p. 53)

Three hurdles to great data visualisation

  • Bad taste: unappealing aesthetics and bad design
  • Bad data: misuse of data, errors or missing values, etc.
  • Perception issues: how people perceive and process what they are looking at differs depending on cognitive and contextual factors

Bad taste

  • Why is it bad?
    • Duplicated information
    • Unnecessary complexity and design features
    • Difficulty extracting the information correctly

Bad data (use)

  • Why is it bad?
    • Cherry-picking of data
    • Misleading the audience
  • Use several types of visualization, vary the scales, etc., to avoid fooling yourself

Perception issues: some examples

Perception issues: colors

  • Hue: What color is it?
    • Red, blue, green, yellow, etc.
  • Chroma (or Saturation): How pure or intense is the color?
    • washed-out or grayish vs. vivid or rich
  • Luminance (or Lightness/Brightness): How light or dark is the color?
    • close to white vs. close to black
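
As a small illustration of these three dimensions, the sketch below uses base R's `hcl()` function, which builds a colour from a hue, a chroma, and a luminance value. The specific numbers are arbitrary assumptions chosen for the example; each vector changes one dimension while holding the other two fixed.

```r
# Illustrative values only: the hue, chroma, and luminance numbers are assumptions.
# hcl(h, c, l) (from grDevices, attached by default) returns hex colour strings.

# Vary hue only: different colours of similar intensity and lightness
hues <- hcl(h = c(0, 120, 240), c = 60, l = 65)

# Vary chroma only: from grayish to more vivid, same hue and lightness
chromas <- hcl(h = 240, c = c(10, 40, 70), l = 65)

# Vary luminance only: from dark to light, same hue and chroma
luminances <- hcl(h = 240, c = 40, l = c(30, 60, 90))

print(hues)
print(chromas)
print(luminances)
```

Varying hue tends to suit unordered categories, while a luminance ramp reads naturally as an ordered scale.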

Perception issues: shapes and scales

Part 2: Practical Data Visualization with ggplot2

What is ggplot2?

  • ggplot2 is a data visualization package for R developed by Hadley Wickham (2016).
  • It lets you build complex plots from simple building blocks.
  • Inspired by the Grammar of Graphics (Wilkinson, 2005) — a system for describing and thinking about graphics.
    • The Grammar of Graphics describes a consistent syntax for building complex graphics by combining simple components
    • This approach shifts the focus from choosing predefined chart types to systematically defining what to show and how to show it.

Why should you use ggplot2?

  • Powerful and flexible
  • Grammar of Graphics:
    • Stems from a coherent logic of visualization, which helps develop reflexivity
    • Customize almost every visual aspect
  • Intuitive syntax once you understand the grammar
  • Active community, extensive documentation, updates, and extension packages
  • Integrated with R and the tidyverse suite

ggplot2 examples

How ggplot2 Works

  • Plots as a combination of different layers
    • ggplot2 builds plots by layers, each layer adds a different component to the plot
    • The concept behind ggplot2 divides a plot into three fundamental parts: Plot = Data + Mapping + Geometry.
    • Then you customize the plot details (the theme), e.g., the scales, grid, labels, and legends, using specific functions
  • ⇒ Plots are built incrementally by adding components piece by piece
  • Coherent logic: it pushes you to ask yourself the right questions: which geom will I use? With which data?

How ggplot2 Works: Three Main Components

  • Data: The dataset used to generate the plot.
  • Mapping (aesthetics): Specifies which variables from the dataset are mapped to visual properties like position, color, or size.
  • Geom: Refers to the geometric objects (e.g., points, lines, bars) that represent data on the plot ⇒ the type of plot you want to create. You can combine multiple geoms, e.g., points with a regression line.
  • Then the details can be improved!
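
The three components and the incremental logic can be sketched on a tiny example. The data frame `toy` and its columns `hours` and `score` are hypothetical, invented only for this illustration; they are not part of the ISPOLE dataset.

```r
library(ggplot2)

# Hypothetical toy data (not the ISPOLE dataset)
toy <- data.frame(
  hours = c(1, 2, 3, 4, 5, 6),
  score = c(52, 55, 61, 60, 68, 72)
)

# 1. Data + mapping: axes are set up, but nothing is drawn yet
p <- ggplot(data = toy, mapping = aes(x = hours, y = score))

# 2. Add a geom: one point per row
p_points <- p + geom_point()

# 3. Combine a second geom: a linear regression line on top of the points
p_both <- p_points + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# 4. Improve the details: labels and theme
p_final <- p_both +
  labs(x = "Hours", y = "Score") +
  theme_minimal()

p_final
```

Each intermediate object is itself a valid plot, so you can inspect the result after every step.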

Examples

Presenting the dataset: ISPOLE publications in DIAL

Rows: 2,779
Columns: 50
$ id                                 <chr> "boreal:281323", "boreal:299186", "…
$ type_de_publication                <chr> "Article de périodique (Journal art…
$ sous_type_de_publication           <chr> "Article de recherche", "Article de…
$ sous_type_specifique               <chr> "", "", "", "", "", "", "", "", "",…
$ annee                              <chr> "2025", "2025", "2025", "2025", "20…
$ statut_de_publication              <chr> "Accepté/Sous presse", "Publié", "A…
$ type_d_acces                       <chr> "Accès interdit", "Accès libre", "A…
$ entite_s_departement_s             <chr> "UCL - SSH/SPLE - Institut de scien…
$ auteur_s                           <chr> "Dehoux, Amaury", "Bocquet, Nicolas…
$ editeur_s_directeur_s_scientifique <chr> "", "", "", "", "", "", "", "", "",…
$ collaborateur_s                    <chr> "", "", "", "", "", "", "", "", "",…
$ traducteur_s                       <chr> "", "", "", "", "", "", "", "", "",…
$ prefacier_s                        <chr> "", "", "", "", "", "", "", "", "",…
$ promoteur_s                        <chr> "", "", "", "", "", "", "", "", "",…
$ titre                              <chr> "\"Comme nous existons\" de Kaoutar…
$ language                           <chr> "Français", "Anglais", "Français", …
$ editeur_commercial                 <chr> "", "Regulation & Governance", "", …
$ lieu_d_edition                     <chr> "", "Australia", "", "Den Haag", ""…
$ pagination                         <chr> "Vol. 79, no. 2, p. à paraître (202…
$ doi                                <chr> "", "https://doi.org/10.1111/rego.1…
$ url                                <chr> "http://hdl.handle.net/2078.1/28132…
$ url_pubmed                         <chr> "", "", "", "", "", "", "", "", "",…
$ titre_du_periodique                <chr> "French Studies", "Regulation & Gov…
$ titre_abrege                       <chr> "", "", "", "", "", "", "", "", "",…
$ issn                               <chr> "0016-1128", "1748-5991", "1777-580…
$ eissn                              <chr> "1468-2931", "", "", "2334-3745", "…
$ ponderation                        <chr> "", "", "", "", "", "", "", "", "",…
$ peer_review                        <chr> "Peer-reviewed", "Peer-reviewed", "…
$ mention_d_edition                  <chr> "", "", "", "", "", "", "", "", "",…
$ collection                         <chr> "", "", "", "", "", "", "", "", "",…
$ isbn                               <chr> "", "", "", "", "", "", "", "", "",…
$ auteur_s_de_l_ouvrage_hote         <chr> "", "", "", "", "", "", "", "", "",…
$ titre_de_l_ouvrage_hote            <chr> "", "", "", "", "", "", "", "", "",…
$ nom_lieu_date_de_la_conference     <chr> "", "", "", "", "", "", "", "", "",…
$ jury                               <chr> "", "", "", "", "", "", "", "", "",…
$ date_de_defense                    <chr> "", "", "", "", "", "", "", "", "",…
$ pays_organisme_brevets             <chr> "", "", "", "", "", "", "", "", "",…
$ numero_brevets                     <chr> "", "", "", "", "", "", "", "", "",…
$ organisme_et_collection_rapport    <chr> "", "", "", "", "", "", "", "", "",…
$ mot_s_cle_s                        <chr> "Harchi ¦ Ernaux ¦ Postmigration ¦ …
$ mesh                               <chr> "", "", "", "", "", "", "", "", "",…
$ jel                                <chr> "", "", "", "", "", "", "", "", "",…
$ c_re_f                             <chr> "", "", "", "", "", "", "", "", "",…
$ lc                                 <chr> "", "", "", "", "", "", "", "", "",…
$ identifiants_institut_pole         <chr> "GLOBALIT", "", "", "", "", "", "",…
$ financement_institution            <chr> "", "", "", "", "", "", "", "", "",…
$ financement_subvention             <chr> "", "", "", "", "", "", "", "", "",…
$ financement_programme              <chr> "", "", "", "", "", "", "", "", "",…
$ financement_projet                 <chr> "", "", "", "", "", "", "", "", "",…
$ n_titre                            <int> 83, 129, 187, 85, 95, 98, 77, 95, 1…

Example Plot: Number of Publications by Year

Start with the data and the mapping:

ggplot(data = data_ispole, aes(annee))

Example Plot: Number of Publications by Year

Identify the Issue: Bad Encoding?

annee
Delreux, Tom;";";";";";One big conversation: the EU’s climate diplomacy across the international regime complex on climate change: the case of the Paris Agreement Negotiations.;Anglais;";";";";http://hdl.handle.net/2078.1/260592;[…]
Cambré, Bart
Rihoux, Benoît;";";";";";The impact of the Stemtest/Test électoral on voting in the Belgian general elections of 2019;Anglais;";";";";http://hdl.handle.net/2078.1/224165;[…]
;Anglais;";";";";http://hdl.handle.net/2078.1/222588;[…]
Van Ingelgom, Virginie
mémoire
Burns, Charlotte;";";";";";The involvement of the European Parliament in UN climate negotiations;Anglais;";";";";http://hdl.handle.net/2078.1/200022;[…]
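
One way to list such offending values is to attempt the numeric conversion and look at what fails. This is only a sketch: the vector `annee` below is a hypothetical stand-in for `data_ispole$annee`, filled with a few values of the kind shown above.

```r
# Hypothetical stand-in for data_ispole$annee
annee <- c("2019", "2021", "Delreux, Tom", "Van Ingelgom, Virginie", "2018", "")

# Attempt the conversion quietly; anything that is not a number becomes NA
annee_num <- suppressWarnings(as.numeric(annee))

# Raw values that could not be read as a year
bad_values <- annee[is.na(annee_num)]
print(bad_values)

# Number of affected rows
n_bad <- sum(is.na(annee_num))
print(n_bad)
```

Inspecting the failures before converting helps decide whether the rows can simply be dropped or whether the file should be re-imported with different parsing options.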

Example Plot: Number of Publications by Year

(Re)start with the data and the mapping:

# as.numeric() turns the malformed values into NA (R warns about the coercion)
data_ispole %>%
  mutate(annee = as.numeric(annee)) %>%
  ggplot(aes(annee))

Example Plot: Number of Publications by Year

Now add the geom

data_ispole %>%
  mutate(annee = as.numeric(annee)) %>%
  ggplot(aes(annee)) +
  geom_bar()
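
Note that no y variable is needed here: `geom_bar()` counts the rows for each value of x. The sketch below, on a hypothetical toy data frame `pubs`, shows the equivalent two-step version, where the counts are computed first with `count()` and then drawn with `geom_col()`.

```r
library(ggplot2)
library(dplyr)

# Hypothetical toy data: one row per publication
pubs <- data.frame(annee = c(2019, 2019, 2020, 2021, 2021, 2021))

# geom_bar() counts the rows for each value of x
p_bar <- ggplot(pubs, aes(annee)) +
  geom_bar()

# Equivalent: count first, then draw the precomputed heights with geom_col()
pubs_count <- pubs %>% count(annee)
p_col <- ggplot(pubs_count, aes(annee, n)) +
  geom_col()

# Compare the bar heights computed by each approach
print(layer_data(p_bar)$count)
print(pubs_count$n)
```

Use `geom_bar()` when the data have one row per observation, and `geom_col()` when the heights are already in a column.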

Example Plot: Number of Publications by Year

Now add the geom (bis)

data_ispole %>%
  mutate(annee = as.numeric(annee)) %>%
  ggplot(aes(annee, fill = peer_review)) +
  geom_bar()
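
Placing `fill` inside `aes()` maps it to a variable, so the colour varies with the data. Placing it outside `aes()` sets one constant colour for all bars. The sketch below contrasts the two on a hypothetical toy data frame; the value "Not peer-reviewed" is an assumption made for the example.

```r
library(ggplot2)

# Hypothetical toy data; the "Not peer-reviewed" label is assumed
pubs <- data.frame(
  annee = c(2019, 2019, 2020, 2021, 2021, 2021),
  peer_review = c("Peer-reviewed", "Not peer-reviewed", "Peer-reviewed",
                  "Peer-reviewed", "Peer-reviewed", "Not peer-reviewed")
)

# Mapping: fill inside aes() depends on the data and produces a legend
p_mapped <- ggplot(pubs, aes(annee, fill = peer_review)) +
  geom_bar()

# Setting: fill outside aes() is a constant applied to every bar
p_set <- ggplot(pubs, aes(annee)) +
  geom_bar(fill = "darkseagreen")

# Number of distinct fill colours actually drawn in each plot
n_fill_mapped <- length(unique(layer_data(p_mapped)$fill))
n_fill_set <- length(unique(layer_data(p_set)$fill))
print(c(mapped = n_fill_mapped, set = n_fill_set))
```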

Example Plot: Number of Publications by Year

Now add the details

library(tidyverse) # ggplot2, dplyr, and the %>% pipe used in these examples
library(openxlsx)
library(ggpubr)
library(hrbrthemes)
library(ggsci)
library(see)
library(gghighlight)

data_ispole <- data_ispole %>% mutate(annee = as.numeric(annee))

Example Plot: Number of Publications by Year

Now add the details

data_ispole %>%
  ggplot(aes(annee)) +
  geom_bar(fill = "darkseagreen", color = "black") +
  scale_x_continuous(breaks = c(2010, 2024)) +
  theme(
    panel.grid.major.y = element_line(color = "black", linetype = "dashed"),
    panel.grid.minor.y = element_line(color = "red", linetype = "dotted")
  )

Example Plot: Number of Publications by Year

Now add the details

data_ispole %>%
  ggplot(aes(annee)) +
  geom_bar(fill = "darkseagreen", color = "black") +
  theme_pubclean() +
  theme(plot.title = element_text(size = 20, hjust = 0.5)) +
  labs(x = NULL, y = NULL, title = "Evolution of ISPOLE's Publication Activity Over Time")

Title Length Across Publication Types

  • What would be a good way to visualize this?
data_ispole %>%
  select(type_de_publication, n_titre) %>%
  glimpse()
Rows: 2,779
Columns: 2
$ type_de_publication <chr> "Article de périodique (Journal article)", "Articl…
$ n_titre             <int> 83, 129, 187, 85, 95, 98, 77, 95, 127, 77, 62, 98,…

Title Length Across Publication Types

  • What would be a good way to visualize this?
data_ispole %>%
  ggplot(aes(x = type_de_publication, y = n_titre))

Visualizing Title Length Across Publication Types

  • Visualize the data through a boxplot
data_ispole %>%
  # Keep only the six main publication types
  filter(type_de_publication %in% c(
    "Monographie (Book)",
    "Contribution à ouvrage collectif (Book Chapter)",
    "Article de périodique (Journal article)",
    "Thèse (Dissertation)",
    "Communication à un colloque (Conference Paper)",
    "Document de travail (Working Paper)"
  )) %>%
  ggplot(aes(x = type_de_publication, y = n_titre)) +
  geom_boxplot() +
  theme_pubclean()
# Optional: append `+ coord_flip()` to make the long type labels easier to read

Visualizing Title Length Across Publication Types

  • How could we improve this plot?

Visualizing Title Length Across Publication Types

  • Spot the modifications!
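
The modified plot itself is not reproduced here. As an assumption about the kind of changes one might make, and not necessarily the authors' version, the sketch below orders the categories by median, flips the axes so the labels are readable, and adds explicit labels. The data frame `toy` and its shortened type labels are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data with shortened type labels
toy <- data.frame(
  type = rep(c("Dissertation", "Book", "Journal article"), each = 3),
  n_titre = c(110, 120, 130, 50, 60, 70, 90, 95, 100)
)

p <- ggplot(toy, aes(x = reorder(type, n_titre, FUN = median), y = n_titre)) +
  geom_boxplot(fill = "darkseagreen") +
  coord_flip() +  # horizontal boxes leave room for long category labels
  labs(x = NULL, y = "Title length", title = "Title length by publication type")

p
```

Ordering by median replaces the default alphabetical order, which rarely carries any meaning for the reader.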

Visualizing Title Length Across Publication Types

  • Add Faceting by Another Variable (and Another Geom)

General Rule

  • (Almost) Always Use Faceting When Showing Differences Across a Variable
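
A minimal sketch of faceting, assuming a hypothetical toy data frame with shortened labels: `facet_wrap()` draws one panel per value of the faceting variable, and a second geom overlays the raw points on the boxplots.

```r
library(ggplot2)

# Hypothetical toy data; labels and values are invented for the example
toy <- data.frame(
  type = rep(c("Book", "Journal article"), each = 6),
  peer_review = rep(c("Peer-reviewed", "Not peer-reviewed"), times = 6),
  n_titre = c(55, 60, 65, 70, 62, 58, 90, 100, 95, 105, 98, 92)
)

p <- ggplot(toy, aes(x = type, y = n_titre)) +
  geom_boxplot() +
  geom_jitter(width = 0.1, alpha = 0.5) +  # second geom: the raw observations
  facet_wrap(~ peer_review)                # one panel per peer-review status

p
```

By default the panels share the same axes, which makes comparisons across them straightforward.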

Exercise

On Paper (No Computers)

  • How would you visualize the relationship between the number of conference papers and the number of published outputs (books, journal articles, and book chapters)?

  • How would you compare peer-reviewed and non–peer-reviewed publications in this context?

  • In both cases, how would you structure your ggplot2 code accordingly?

References

Healy, K. (2018). Data visualization: A practical introduction. Princeton University Press.
Tufte, E. R. (1983). The visual display of quantitative information (1st ed.). Graphics Press.
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org
Wilkinson, L. (2005). The grammar of graphics (2nd ed.). Springer.